
    Meta Reinforcement Learning with Latent Variable Gaussian Processes

    Learning from small data sets is critical in many practical applications where data collection is time consuming or expensive, e.g., robotics, animal experiments or drug design. Meta learning is one way to increase the data efficiency of learning algorithms by generalizing learned concepts from a set of training tasks to unseen, but related, tasks. Often, this relationship between tasks is hard-coded or relies in some other way on human expertise. In this paper, we frame meta learning as a hierarchical latent variable model and infer the relationship between tasks automatically from data. We apply our framework in a model-based reinforcement learning setting and show that our meta-learning model effectively generalizes to novel tasks by identifying how new tasks relate to prior ones from minimal data. This results in up to a 60% reduction in the average interaction time needed to solve tasks compared to strong baselines. Comment: 11 pages, 7 figures
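
    The core mechanism described in this abstract can be illustrated with a toy sketch: one shared model takes a per-task latent vector as extra input, and a new task is identified by fitting only that latent vector to a handful of observations. This is an illustrative stand-in (a small NumPy network with finite-difference updates), not the paper's Gaussian process model; the class and method names are hypothetical.

        # Minimal sketch of "shared model + per-task latent variable" (illustrative only).
        import numpy as np

        class TaskLatentModel:
            def __init__(self, input_dim, latent_dim, hidden=32, seed=0):
                rng = np.random.default_rng(seed)
                self.latent_dim = latent_dim
                self.W1 = rng.normal(0.0, 0.1, (input_dim + latent_dim, hidden))
                self.W2 = rng.normal(0.0, 0.1, (hidden, 1))

            def predict(self, x, h):
                # Concatenate the task latent h to every input row as extra context.
                z = np.concatenate([x, np.tile(h, (len(x), 1))], axis=1)
                return np.tanh(z @ self.W1) @ self.W2

            def infer_latent(self, x, y, steps=200, lr=0.1):
                # Identify a new task from minimal data by fitting only h
                # (x has shape (N, input_dim), y has shape (N, 1)).
                h = np.zeros(self.latent_dim)
                eps = 1e-4
                for _ in range(steps):
                    base = np.mean((self.predict(x, h) - y) ** 2)
                    grad = np.zeros_like(h)
                    for i in range(self.latent_dim):
                        h_eps = h.copy()
                        h_eps[i] += eps
                        grad[i] = (np.mean((self.predict(x, h_eps) - y) ** 2) - base) / eps
                    h -= lr * grad
                return h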

    Do more placement officers lead to lower unemployment? Evidence from Germany

    "In this paper we examine the effect of a pilot project of the German Federal Employment Agency, where in 14 German local employment offices the caseload (number of unemployed per caseworker) was significantly reduced. Since the participating local offices were not chosen at random, we have to take into account potential selection bias. Therefore, we rely on a combination of matching and a difference-in-differences estimator. We use two indicators of the offices' success (unemployment rate, growth of the number of SCIII clients). Our results indicate a positive effect of a lower caseload on both outcome variables." (Author's abstract, IAB-Doku) ((en))Arbeitsvermittlung - Erfolgskontrolle, Arbeitsvermittler - Modellversuch, berufliche Reintegration - Quote, Arbeitslosenquote, Arbeitsvermittlerquote

    Automatic Curriculum Learning For Deep RL: A Short Survey

    Automatic Curriculum Learning (ACL) has become a cornerstone of recent successes in Deep Reinforcement Learning (DRL). These methods shape the learning trajectories of agents by challenging them with tasks adapted to their capacities. In recent years, they have been used to improve sample efficiency and asymptotic performance, to organize exploration, to encourage generalization or to solve sparse reward problems, among others. The ambition of this work is dual: 1) to present a compact and accessible introduction to the Automatic Curriculum Learning literature and 2) to draw a bigger picture of the current state of the art in ACL to encourage the cross-breeding of existing concepts and the emergence of new ideas. Comment: Accepted at IJCAI 2020
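
    One common ACL recipe of the kind this survey covers is learning-progress-based task sampling: tasks are drawn in proportion to how fast the agent's return is currently changing on them. The sketch below is an illustrative example of that idea, not code from the survey; the class name and windowing heuristic are assumptions.

        # Toy learning-progress curriculum: sample tasks where returns change fastest.
        import numpy as np

        class LearningProgressCurriculum:
            def __init__(self, n_tasks, window=10):
                # One return history per task; `window` sets the progress estimate.
                self.returns = [[] for _ in range(n_tasks)]
                self.window = window

            def update(self, task, episode_return):
                self.returns[task].append(episode_return)

            def sample_task(self, rng=None):
                rng = rng or np.random.default_rng()
                progress = []
                for history in self.returns:
                    if len(history) < 2 * self.window:
                        progress.append(1.0)  # favour tasks we know little about
                    else:
                        recent = np.mean(history[-self.window:])
                        older = np.mean(history[-2 * self.window:-self.window])
                        progress.append(abs(recent - older) + 1e-6)
                p = np.array(progress) / np.sum(progress)
                return int(rng.choice(len(p), p=p))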

    Fast Context Adaptation via Meta-Learning

    We propose CAVIA for meta-learning, a simple extension to MAML that is less prone to meta-overfitting, easier to parallelise, and more interpretable. CAVIA partitions the model parameters into two parts: context parameters that serve as additional input to the model and are adapted on individual tasks, and shared parameters that are meta-trained and shared across tasks. At test time, only the context parameters are updated, leading to a low-dimensional task representation. We show empirically that CAVIA outperforms MAML on regression, classification, and reinforcement learning. Our experiments also highlight weaknesses in current benchmarks, in that the amount of adaptation needed in some cases is small. Comment: Published at the International Conference on Machine Learning (ICML) 2019
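
    The split between context and shared parameters can be illustrated with a minimal PyTorch sketch of one meta-training step, written under stated assumptions: `model`, `sample_task_batch`, and `loss_fn` are hypothetical helpers, and `model(x, phi)` is assumed to concatenate the context vector to its input. The inner loop adapts only the context parameters phi (reset per task), while `theta_optimizer` holds the shared parameters; this mirrors the partition the abstract describes rather than reproducing the published code.

        # Illustrative CAVIA-style meta-training step (not the authors' implementation).
        import torch

        def cavia_meta_step(model, theta_optimizer, sample_task_batch,
                            loss_fn, context_dim=5, inner_steps=1, inner_lr=1.0):
            meta_loss = 0.0
            for support_x, support_y, query_x, query_y in sample_task_batch():
                # Context parameters are reset for every task; at test time this
                # inner loop is the only adaptation that happens.
                phi = torch.zeros(context_dim, requires_grad=True)
                for _ in range(inner_steps):
                    inner_loss = loss_fn(model(support_x, phi), support_y)
                    (grad,) = torch.autograd.grad(inner_loss, phi, create_graph=True)
                    phi = phi - inner_lr * grad  # only phi is adapted here
                meta_loss = meta_loss + loss_fn(model(query_x, phi), query_y)
            theta_optimizer.zero_grad()
            meta_loss.backward()      # backprop through the inner loop
            theta_optimizer.step()    # update the shared parameters
            return meta_loss.item()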